Vocal-tract area-function parameters from formant frequencies
Abstract
The basic three-parameter vocal tract (VT) area function model of Fant [2] has been extended to allow for asymmetry and variable length of the tongue hump section, as well as variations of the overall VT length. An important class of articulations conforms with this model, mostly vowel sounds and consonants without apical modification or a secondary posterior articulation. Starting out from a theory of small perturbations [1,4], we calculate the shifts in the VT parameters accompanying a certain set of shifts in F1, F2, and F3. This is done from a set of three linear differential equations expressing the shift in each of F1, F2, and F3 as the sum of the contributions from the unknown shifts in each of the VT parameters. These are solved at each small step of a pathway from a reference starting point to the target formant pattern. The overall length is optimized by reference to F4 and some available measured data. A resynthesis of a time-varying F-pattern from a set of articulatory targets has been attempted, with a possible application in articulatory synthesis.

1. INTRODUCTION

The purpose of this paper is to infer VT configurations from formant frequencies for vowel-like sounds. It is well known that without additional appropriate constraints there exists an infinite number of solutions and that many of them are completely unrealistic. In the present study, the constraint imposed is that a VT configuration should be generated by means of a parametric VT model. For this purpose, the three-parameter VT model of Fant [2] is extended to be easier to handle and to capture more essential features of vowel production. In addition to the overall tract length, ltot, there are still three main independent parameters in the present version. Together, they specify the size (Ac) and location (Xc) of the VT constriction and the state of the lip section (l0/A0).
Once these parameters are known, the underlying VT profile, in terms of the area function, is defined. To derive these articulatory parameters, a set of linear differential equations is formulated. The basic algorithm is to first determine the amount of variation in formant frequency due to a small, known perturbation of the articulatory parameters (namely, the sensitivity function). This knowledge is then used to determine the variation in these parameters for a desired shift of the formant frequencies. It is thus an iterative procedure. For the time being, we use the first three formant frequencies to solve for the three model parameters, Ac, Xc, and l0/A0. The overall length of the VT, ltot, is assumed to be known in advance and is optimized with reference to F4.

The relation between the acoustic and articulatory data is non-linear. If the step in formant frequency during one iteration is too large, this linear algorithm may fail to converge. However, the gross linearity between the acoustic and articulatory domains can be preserved by dividing that step into a number of smaller sub-steps [1]. Moreover, the success of the algorithm depends highly on an appropriate choice of the starting position for the iterative calculation. A proper selection of this starting position not only minimizes the distance between the initial point and the target (given) formant pattern, thereby reducing computation time, but also reduces the risk that the search falls into the so-called "forbidden area" [1]. At present, three rules are implemented to assign a starting position based on the given F1 and F2.

The inverse transformation approach has been tested on a set of Swedish vowels [3]. Reasonable VT area functions have been inferred. We have also attempted to derive the VT area function for the Russian vowel [a], whose area function is known [2].
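The sub-stepped, sensitivity-based inversion described in the text can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_formants` is a hypothetical smooth stand-in for the acoustic forward model (a real one would compute F1 to F3 from the area function), the sensitivity matrix is estimated by finite differences, and each sub-step aims at an intermediate target on a straight line in formant space. All function names, coefficients, and the number of sub-steps are illustrative.

```python
import numpy as np

def toy_formants(p):
    """Hypothetical stand-in for the forward model: maps (Ac, Xc, l0/A0)
    to (F1, F2, F3) in Hz. It is NOT a vocal-tract simulation."""
    ac, xc, lip = p
    f1 = 300.0 + 120.0 * np.sqrt(ac) + 15.0 * xc - 40.0 * lip
    f2 = 2200.0 - 90.0 * xc + 60.0 * np.sqrt(ac) - 150.0 * lip
    f3 = 2900.0 + 10.0 * xc ** 1.5 - 200.0 * lip + 30.0 * ac
    return np.array([f1, f2, f3])

def sensitivity(forward, p, rel_step=1e-4):
    """Finite-difference estimate of dF_i/dp_j around the parameter vector p."""
    f0 = forward(p)
    jac = np.empty((f0.size, p.size))
    for j in range(p.size):
        dp = np.zeros_like(p)
        dp[j] = rel_step * max(abs(p[j]), 1.0)
        jac[:, j] = (forward(p + dp) - f0) / dp[j]
    return jac

def invert_formants(forward, p_start, f_target, n_substeps=20):
    """Walk from the starting formants to the target in small sub-steps,
    solving the 3x3 linear system J * dp = dF at each one."""
    p = np.array(p_start, dtype=float)
    f_start = forward(p)
    for k in range(1, n_substeps + 1):
        # Intermediate goal on the straight line between start and target.
        f_goal = f_start + (f_target - f_start) * k / n_substeps
        jac = sensitivity(forward, p)
        p = p + np.linalg.solve(jac, f_goal - forward(p))
    return p

# Illustrative values only: a "true" parameter set defines the target pattern.
p_true = np.array([1.0, 8.0, 0.5])
f_target = toy_formants(p_true)
p_est = invert_formants(toy_formants, [2.0, 10.0, 1.0], f_target)
residual = np.abs(toy_formants(p_est) - f_target).max()
print("estimated parameters:", p_est)
print("largest formant error (Hz):", residual)
```

Aiming each sub-step at an intermediate target, rather than adding a fixed formant increment, is a design choice of this sketch: it corrects the drift left by the previous linear step. With a real forward model the starting position would still need the care described in the text.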
The result indicates that, for the Russian vowel [a], an asymmetric tongue-hump outline is better suited than a symmetric one, in view of both the resultant F4 and the area function.

The VT parametric model has also been applied to an articulatory synthesis scheme. There are two ways of deriving the time-varying VT area function. One is directly from analysis of the F-pattern sampled at suitable instants, followed by a recovery of VT area functions from the inverse transformation. The second method, which is oriented towards the development of rule systems, is to interpolate Ac, Xc, l0/A0, and ltot on the basis of prescribed articulatory targets (anchoring points), by resorting to some weighting functions. It is found that such interpolation gives a better result if an intermediate state is included. Synthesis of the dynamic pattern of diphthong-like syllables such as [ja] has been attempted in these two ways.

2. THE VT MODEL

The curvature of the tongue hump need not be the same for all vocalic sounds. Thus, the first extension of Fant's VT model [2] is to allow for asymmetry and variable length of the tongue hump [7]. The second modification is the use of circular functions instead of hyperbolic ones for the specification of the horn-shaped constriction. This is to gain computational efficiency. The area function for the horn anterior and posterior to the constriction centre, A(x), is expressed in one of two alternative forms:

$$A(x) = A_c + (8 - A_c)\,\left[1 - \cos(0.1\,R_c\,\pi\,x / X_c)\right] \tag{1a}$$

$$A(x) = A_c + (8 - A_c)\,\left[1 - \cos(0.1\,R_c\,\pi\,x / X_c)\right]^2 \tag{1b}$$

where Rc and Xc control the flaring rate of the area function from the constriction centre Xc. Xc also determines the extension of the hump region. Only a portion of the cosine waveform is used, thereby avoiding an oscillatory profile and allowing for an adjustable length of the hump section. The cross-sectional area beyond the hump region is set to 8 cm². The displacement from Xc is denoted x.
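As a rough illustration, the two horn profiles can be evaluated in Python, reading eqs. (1a) and (1b) as A(x) = Ac + (8 - Ac)·[1 - cos(0.1·R·π·x/X)], with the bracket squared in (1b). The function name `horn_area`, the example parameter values, and the clamping of the cosine argument at π/2 (so that the profile joins the 8 cm² region without oscillating) are assumptions of this sketch, not details confirmed by the text.

```python
import math

MAX_AREA = 8.0  # cm^2, cross-sectional area beyond the hump region (from the text)

def horn_area(x, a_c, r, x_scale, squared=False):
    """Area (cm^2) at displacement x (cm) from the constriction centre.

    squared=False follows eq. (1a); squared=True follows eq. (1b).
    r and x_scale stand for the flaring-rate parameters of whichever
    side of the constriction is being evaluated.
    """
    phase = 0.1 * r * math.pi * abs(x) / x_scale
    # Assumption: only the first quarter period of the cosine is used,
    # so the area rises monotonically from a_c to MAX_AREA and stays there.
    phase = min(phase, math.pi / 2)
    shape = 1.0 - math.cos(phase)
    if squared:
        shape **= 2
    return a_c + (MAX_AREA - a_c) * shape

# Hypothetical values: Ac = 0.5 cm^2, R = 5, X = 4 cm (hump ends at x = 4 cm).
for x in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0):
    a_1a = horn_area(x, 0.5, 5.0, 4.0)
    a_1b = horn_area(x, 0.5, 5.0, 4.0, squared=True)
    print(f"x = {x:.1f} cm  (1a): {a_1a:.2f} cm^2  (1b): {a_1b:.2f} cm^2")
```

Under this reading, the squared form stays close to Ac over a greater distance from the centre, since the bracketed term lies between 0 and 1 and squaring it can only reduce it.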
For the horn section posterior to the constriction centre, Rc and Xc are replaced by Rb and Xb, respectively. Note that for a given Ac, Rc, and Xc, eq. (1b) delivers a longer effective constriction length than eq. (1a) does. It is found that for the back vowels eq. (1a) is more
Similar resources
Estimating the Vocal-Tract Area Function From Formants Using a Sensitivity Function and Least Square
We present a method for estimating the vocal-tract area function from specified formant frequencies. The method extends the work of Story (J.A.S.A., 119, 715-718, 1996) based on a sensitivity function representing the change in the formant frequency due to a small perturbation of the cross-sectional area of the vocal tract. Our method estimates the vocal-tract shape through an iterative procedu...
Estimation of Vocal Tract Area Function from Magnetic Resonance Imaging: Preliminary Results
A method has been developed for three-dimensional reconstruction of the vocal tract shape and for the calculation of area function from magnetic resonance imaging (MRI). MR images were acquired and analyzed for 6 German long vowels uttered by one subject. The resulting vocal tract area functions are the basis for the calculation of formant frequencies. Conformity with acoustically measured form...
Parametric model of VT area functions: vowels and consonants
This is an extension of earlier work on vocal tract area function modelling (Fant, 1960, 1992, 1993; Lin, 1990) retaining a minimum of three independent control parameters in a more complex, physiologically oriented model providing a flexible choice of detail structure. Systematic perturbation analysis of vocal tract boundary conditions and of values of the independent control parameters have b...
Parameterized VT area function inversion
The purpose of our study is to contribute tools for inversion of articulatory to acoustics relations, in specific to perform an estimate of vocal tract area-function parameters from formant frequencies. The inversion is performed in two steps. A first approximation is attained from either a codebook or a neural net and a final optimization is performed by an iterative interpolation for finding ...
Determination of the vocal-tract shape from measured formant frequencies.
We model the vocal tract as a lossless acoustic tube and consider the relationship between the resonant frequencies and the cross-sectional area function. Empirical results show that if the logarithm of the area function is band limited preserving only 2n Fourier components, the lowest n pole and n zero frequencies of the admittance function measured at the lips uniquely determine the area coef...
Technique for "tuning" vocal tract area functions based on acoustic sensitivity functions.
A technique for modifying vocal tract area functions is developed by using sum and difference combinations of acoustic sensitivity functions to perturb an initial vocal tract configuration. First, sensitivity functions [e.g., Fant and Pauli, Proc. Speech Comm. Sem. 74, 1975] are calculated for a given area function, at its specific formant frequencies. The sensitivity functions are then multipl...